Machine-learning-assisted modeling of the atomic potential energy surface (PES) is revolutionizing the field of molecular simulation. With the accumulation of high-quality electronic structure data, a model that can be pretrained on all available data and finetuned on downstream tasks with a small additional effort would bring the field to a new stage. Here we propose DPA-1, a Deep Potential model with a novel attention mechanism, which is highly effective at representing the conformational and chemical spaces of atomic systems and learning the PES. We tested DPA-1 on a number of systems and observed superior performance compared with existing benchmarks. When pretrained on a large-scale dataset containing 56 elements, DPA-1 can be successfully applied to various downstream tasks with a large improvement in sample efficiency. Interestingly, for different elements, the learned type-embedding parameters form a spiral in the latent space and have a natural correspondence with their positions on the periodic table, showing an interesting interpretability of the pretrained DPA-1 model.
translated by Google Translate
Cross-lingual summarization is the task of generating a summary in one language (e.g., English) for a document given in a different language (e.g., Chinese). Against the background of globalization, this task has attracted increasing attention from the computational linguistics community. Nevertheless, a comprehensive review of this task is still lacking. Therefore, we present the first systematic critical review of the datasets, approaches, and challenges in this field. Specifically, we carefully organize existing datasets and approaches according to their construction methods and solution paradigms, respectively. For each type of dataset or approach, we thoroughly introduce and summarize previous efforts and compare them with each other to provide deeper analyses. Finally, we also discuss promising directions and offer our thoughts to facilitate future research. This survey is for both beginners and experts in cross-lingual summarization, and we hope it will serve as a starting point as well as a source of new ideas for researchers and engineers interested in this field.
Multimodal self-supervised learning from videos has been shown to improve models' performance on various downstream tasks. However, such self-supervised pre-training requires large batch sizes and a large amount of computational resources due to the noise present in uncurated data. This is partly because the prevalent training scheme operates in a coarse-grained setting, in which vectors representing whole video clips or natural language sentences are used to compute similarity. Such a scheme makes training noisy, as part of a video clip can be entirely uncorrelated with the other-modality input, such as a text description. In this paper, we propose a fine-grained multimodal self-supervised training scheme that computes similarity between embeddings at a finer scale (such as individual feature-map embeddings and phrase embeddings), and uses attention to reduce the weighting of noisy pairs in the loss function. We show that with the proposed pre-training scheme, we can train smaller models, with smaller batch sizes and far fewer computational resources, to achieve downstream task performance comparable to the state of the art on tasks including action recognition and text-image retrieval.
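A minimal sketch of the fine-grained, attention-weighted similarity idea described above, not the paper's actual model: local visual embeddings and phrase embeddings are compared pairwise, and a softmax over the pairwise similarities down-weights low-similarity (likely noisy) pairs before aggregating to a single clip-text score. The function name, the temperature value, and the aggregation choice are all illustrative assumptions.

```python
import numpy as np

def fine_grained_similarity(video_feats, text_feats, temperature=0.1):
    """Aggregate local similarities into one clip-text score.

    video_feats: (m, d) array of local visual embeddings (e.g. feature-map cells).
    text_feats:  (n, d) array of phrase embeddings.
    Low-similarity pairs receive low attention weight, so noisy,
    uncorrelated pairs contribute little to the final score.
    """
    # L2-normalize so dot products are cosine similarities.
    v = video_feats / np.linalg.norm(video_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sim = v @ t.T                      # (m, n) pairwise similarities
    # Softmax attention over all pairs: noisy pairs get small weights
    # instead of contributing equally, as a mean over pairs would.
    w = np.exp(sim / temperature)
    w /= w.sum()
    return float((w * sim).sum())
```

In a contrastive loss, this score would replace the single dot product between clip-level and sentence-level vectors used by the coarse-grained scheme.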
Sports game summarization aims to generate sports news from live commentaries. However, existing datasets are all constructed through automated collection and cleaning processes, resulting in a large amount of noise. Moreover, current works neglect the knowledge gap between live commentaries and sports news, which limits the performance of sports game summarization. In this paper, we introduce K-SportsSum, a new dataset with two characteristics: (1) K-SportsSum collects a large amount of data from massive games, containing 7,854 commentary-news pairs; to improve quality, K-SportsSum employs a manual cleaning process. (2) Unlike existing datasets, to narrow the knowledge gap, K-SportsSum further provides a large-scale knowledge corpus containing information on 523 sports teams and 14,724 sports players. In addition, we introduce a knowledge-enhanced summarizer that utilizes both live commentaries and the knowledge corpus to generate sports news. Extensive experiments on the K-SportsSum and SportsSum datasets show that our model achieves new state-of-the-art performance. Qualitative analysis and a human study further verify that our model generates more informative sports news.
To encourage AI agents to conduct meaningful Visual Dialogue (VD), the use of Reinforcement Learning (RL) has shown promise. In RL, it is crucial to represent states and to assign rewards based on the state transitions caused by actions. However, the state representations in previous Visual Dialogue works use textual information only, and their transitions are implicit. In this paper, we propose Explicit Concerning States (ECS) to represent which visual contents are concerned at each round and which have been concerned throughout the Visual Dialogue. ECS is modeled from multimodal information and is represented explicitly. Based on ECS, we formulate two intuitive and interpretable rewards to encourage the Visual Dialogue agents to converse on diverse and informative visual information. Experimental results on the VisDial v1.0 dataset show that, according to multiple automatic metrics, a human study, and qualitative analysis, our method enables the Visual Dialogue agents to generate more visually coherent, less repetitive, and more visually informative dialogues compared with previous methods.
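The exact formulation of the ECS-based rewards is not given in the abstract; the following is a hypothetical, much-simplified stand-in that only shows the general shape of a round-level reward penalizing repetition, so that a dialogue agent is pushed toward diverse utterances.

```python
def diversity_reward(utterance, history):
    """Hypothetical round-level reward in [0, 1].

    Returns 1 minus the maximum unigram overlap between the current
    utterance and any previous round, so an utterance that repeats an
    earlier one is penalized. A real ECS reward would instead be
    computed from the explicit multimodal state.
    """
    words = set(utterance.lower().split())
    if not history or not words:
        return 1.0
    overlaps = [len(words & set(h.lower().split())) / len(words)
                for h in history]
    return 1.0 - max(overlaps)
```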
Text clustering and topic extraction are two important tasks in text mining. Usually, these two tasks are performed separately. For topic extraction to facilitate clustering, we can first project texts into a topic space and then perform a clustering algorithm to obtain clusters. To promote topic extraction by clustering, we can first obtain clusters with a clustering algorithm and then extract cluster-specific topics. However, this naive strategy ignores the fact that text clustering and topic extraction are strongly correlated and follow a chicken-and-egg relationship. Performing them separately fails to make them mutually benefit each other to achieve the best overall performance. In this paper, we propose an unsupervised text clustering and topic extraction framework (ClusTop) which integrates text clustering and topic extraction into a unified framework and can achieve high-quality clustering results while extracting topics from each cluster simultaneously. Our framework includes four components: enhanced language model training, dimensionality reduction, clustering, and topic extraction, where the enhanced language model can be viewed as a bridge between clustering and topic extraction. On one hand, it provides text embeddings with a strong cluster structure which facilitates effective text clustering; on the other hand, it pays close attention to topic-related words for topic extraction thanks to its self-attention architecture. Moreover, the training of the enhanced language model is unsupervised. Experiments on two datasets demonstrate the effectiveness of our framework and provide benchmarks for different model combinations in this framework.
In this paper, we study the problem of aligning a batch of linearly correlated images, where the observed images are deformed by some unknown domain transformations, and corrupted by additive Gaussian noise and sparse noise simultaneously. By stacking these images as the frontal slices of a third-order tensor, we propose to utilize the tensor factorization method via transformed tensor-tensor product to explore the low-rankness of the underlying tensor, which is factorized into the product of two smaller tensors via transformed tensor-tensor product under any unitary transformation. The main advantage of the transformed tensor-tensor product is that its computational complexity is lower compared with the existing literature based on the transformed tensor nuclear norm. Moreover, the tensor $\ell_p$ $(0<p<1)$ norm is employed to characterize the sparsity of the sparse noise and the tensor Frobenius norm is adopted to model the additive Gaussian noise. A generalized Gauss-Newton algorithm is designed to solve the resulting model by linearizing the domain transformations, and a proximal Gauss-Seidel algorithm is developed to solve the corresponding subproblem. Furthermore, the convergence of the proximal Gauss-Seidel algorithm is established, and its convergence rate is also analyzed based on the Kurdyka-$\L$ojasiewicz property. Extensive numerical experiments on real-world image datasets are carried out to demonstrate the superior performance of the proposed method as compared to several state-of-the-art methods in both accuracy and computational time.
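The transformed tensor-tensor product can be made concrete for the classic choice of transform, the discrete Fourier transform along the third mode (the paper allows any unitary transformation): transform both tensors, multiply their frontal slices independently, then transform back. A minimal sketch:

```python
import numpy as np

def transformed_t_product(A, B):
    """t-product of A (n1 x n2 x n3) and B (n2 x n4 x n3) under the DFT.

    Each frontal slice is multiplied independently in the transformed
    domain, which is what makes the product cheap: n3 ordinary matrix
    products instead of one large structured product.
    """
    Ah = np.fft.fft(A, axis=2)   # to the transformed domain
    Bh = np.fft.fft(B, axis=2)
    n3 = A.shape[2]
    Ch = np.stack([Ah[:, :, k] @ Bh[:, :, k] for k in range(n3)], axis=2)
    return np.real(np.fft.ifft(Ch, axis=2))  # back-transform; result is real
```

The low-rank factorization in the paper writes the underlying tensor as such a product of two smaller tensors, avoiding the singular value decompositions required by transformed-tensor-nuclear-norm approaches.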
Linear temporal logic (LTL) is a widely-used task specification language which has a compositional grammar that naturally induces temporally extended behaviours across tasks, including conditionals and alternative realizations. An important problem in RL with LTL tasks is to learn task-conditioned policies which can zero-shot generalize to new LTL instructions not observed during training. However, because symbolic observation is often lossy and LTL tasks can have long time horizons, previous works can suffer from issues such as training sample inefficiency and infeasibility or sub-optimality of the found solutions. In order to tackle these issues, this paper proposes a novel multi-task RL algorithm with improved learning efficiency and optimality. To achieve the global optimality of task completion, we propose to learn options dependent on the future subgoals via a novel off-policy approach. In order to propagate the rewards of satisfying future subgoals back more efficiently, we propose to train a multi-step value function conditioned on the subgoal sequence, which is updated with Monte Carlo estimates of multi-step discounted returns. In experiments on three different domains, we evaluate the LTL generalization capability of the agent trained by the proposed method, showing its advantage over previous representative methods.
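The Monte Carlo estimate of the multi-step discounted return used as the value-function update target can be computed in one backward pass over a trajectory. The subgoal-sequence conditioning of the value function is omitted here; this sketch only shows the return computation itself.

```python
def multi_step_returns(rewards, gamma=0.99):
    """Discounted return G_t = r_t + gamma * r_{t+1} + ... for every t.

    A single backward pass reuses G_{t+1} to form G_t, so the whole
    trajectory is processed in O(T) rather than O(T^2).
    """
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    return returns[::-1]
```

Propagating rewards for future subgoals through such full returns, rather than one-step bootstrapping, is what lets credit for distant subgoal satisfaction reach early decisions quickly.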
Deep residual networks (ResNets) have shown state-of-the-art performance in various real-world applications. Recently, the ResNet model was reparameterized and interpreted as the solution of a continuous ordinary differential equation, i.e., the Neural-ODE model. In this study, we propose a neural generalized ordinary differential equation (Neural-GODE) model with layer-varying parameters to further extend the Neural-ODE to approximate discrete ResNets. Specifically, we use nonparametric B-spline functions to parameterize the Neural-GODE so that the trade-off between model complexity and computational efficiency can be easily balanced. It is demonstrated that ResNets and the Neural-ODE model are special cases of the proposed Neural-GODE model. On two benchmark datasets, MNIST and CIFAR-10, we show that the layer-varying Neural-GODE is more flexible and general than the standard Neural-ODE. Furthermore, the Neural-GODE enjoys computational and memory benefits while performing comparably to ResNets in prediction accuracy.
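A sketch of the layer-varying parameterization for a single scalar weight, assuming SciPy and a clamped uniform knot vector (both illustrative choices, not necessarily the paper's): the weight becomes a B-spline function of depth t in [0, 1], and the number of spline coefficients controls the Neural-ODE-to-ResNet trade-off mentioned above.

```python
import numpy as np
from scipy.interpolate import BSpline

def layer_varying_weight(coeffs, degree=3):
    """Parameterize a scalar ODE weight theta(t), t in [0, 1], by B-splines.

    Fewer trainable coefficients give a smoother, cheaper parameterization
    (approaching the constant weights of a plain Neural-ODE); more
    coefficients approach independent per-layer ResNet weights.
    """
    n = len(coeffs)
    # Clamped uniform knots: the spline hits the first and last coefficient
    # at t = 0 and t = 1 respectively.
    interior = np.linspace(0.0, 1.0, n - degree + 1)
    knots = np.concatenate([np.zeros(degree), interior, np.ones(degree)])
    return BSpline(knots, np.asarray(coeffs, float), degree)

# theta(t) would then be evaluated at each ODE-solver time step.
theta = layer_varying_weight([0.5, 0.5, 0.5, 0.5])  # equal coeffs: constant
```

With equal coefficients the spline is constant in t (the Neural-ODE special case); distinct coefficients let the weight vary with depth.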
Expandable networks have demonstrated their advantage in dealing with the catastrophic forgetting problem. Considering that different tasks may need different structures, recent methods design dynamic structures adapted to different tasks via sophisticated techniques. Their routine is to first search for an expandable structure and then train on the new task, which, however, breaks the task into multiple training stages and leads to suboptimal results or excessive computational cost. In this paper, we propose an end-to-end trainable adaptively expandable network named E2-AEN, which dynamically generates lightweight structures for new tasks without any accuracy drop on previous tasks. Specifically, the network contains a sequence of powerful feature adapters that augment the previously learned representations for new tasks while avoiding task interference. These adapters are controlled via an adaptive gate-based pruning strategy that decides whether the expanded structures can be pruned, making the network structure dynamically changeable according to the complexity of the new tasks. Moreover, we introduce a novel sparsity-activation regularization to encourage the model to learn discriminative features with limited parameters. E2-AEN reduces cost and can be built upon any feed-forward architecture in an end-to-end manner. Extensive experiments on both classification (i.e., CIFAR and VDD) and detection (i.e., COCO, VOC and the ICCV2021 SSLAD challenge) benchmarks demonstrate the effectiveness of the proposed method, which achieves new remarkable results.
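A hypothetical minimal version of the gate-based adapter idea, not E2-AEN's actual module: the adapter's residual contribution is scaled by a sigmoid gate, and adapters whose gate closes during training are pruned, so the expanded structure survives only where the new task needs it. All names and the threshold are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GatedAdapter:
    """A feature adapter whose contribution is controlled by a learned gate."""

    def __init__(self, dim, gate_logit=0.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((dim, dim)) * 0.01  # adapter weights
        self.gate_logit = gate_logit                     # learned scalar

    def __call__(self, h):
        # Gated residual: the previously learned features pass through
        # untouched, so earlier tasks cannot be damaged by the adapter.
        return h + sigmoid(self.gate_logit) * (h @ self.W)

    def prunable(self, threshold=0.05):
        # A (nearly) closed gate means the adapter contributes almost
        # nothing and the expanded structure can be removed.
        return sigmoid(self.gate_logit) < threshold
```

The sparsity-activation regularization described above would, in this picture, push gate logits of unneeded adapters toward large negative values so that more adapters become prunable.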